Dependency Relations as Source Context in Phrase-Based SMT

Authors

  • Rejwanul Haque
  • Sudip Kumar Naskar
  • Antal van den Bosch
  • Andy Way
Abstract

The Phrase-Based Statistical Machine Translation (PB-SMT) model has recently begun to include source context modeling, under the assumption that the proper lexical choice for an ambiguous word can be determined from the context in which it appears. Various types of lexical and syntactic features, such as words, parts-of-speech, and supertags, have been explored as effective source context in SMT. In this paper, we show that position-independent syntactic dependency relations of the head of a source phrase can be modeled as useful source context to improve target phrase selection and thereby improve the overall performance of PB-SMT. On a Dutch-English translation task, by combining dependency relations and syntactic contextual features (part-of-speech), we achieved a 1.0 BLEU (Papineni et al., 2002) point improvement (3.1% relative) over the baseline.
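As a rough illustration of what "position-independent dependency relations of the head of a source phrase" could look like as features, the following minimal Python sketch may help. It is not the authors' implementation: the `Token` layout, the rule for picking the phrase head (the leftmost token whose governor lies outside the phrase span), the function names, and the relation labels in the toy Dutch sentence are all assumptions made for this example.

```python
from typing import Dict, List, NamedTuple


class Token(NamedTuple):
    form: str  # surface word
    pos: str   # part-of-speech tag (hypothetical tagset)
    head: int  # 0-based index of the governing token, -1 for the root
    rel: str   # dependency relation to the governor (hypothetical labels)


def phrase_head(tokens: List[Token], start: int, end: int) -> int:
    """Return the index of the head of the span [start, end).

    Assumption: the head is the leftmost token whose governor
    lies outside the span (or which is the sentence root).
    """
    for i in range(start, end):
        governor = tokens[i].head
        if governor < start or governor >= end:
            return i
    raise ValueError("span is empty or the parse is not a tree")


def context_features(tokens: List[Token], start: int, end: int) -> Dict[str, object]:
    """Collect dependency context for the head of a source phrase.

    The relations to the head's dependents are stored as a set, so the
    feature ignores where the dependents occur in the sentence.
    """
    h = phrase_head(tokens, start, end)
    return {
        "head": tokens[h].form,
        "head_pos": tokens[h].pos,
        "parent_rel": tokens[h].rel,
        "child_rels": frozenset(t.rel for t in tokens if t.head == h),
    }


# Toy parse of "de man leest een boek" ("the man reads a book").
tokens = [
    Token("de", "DET", 1, "det"),
    Token("man", "N", 2, "su"),
    Token("leest", "V", -1, "root"),
    Token("een", "DET", 4, "det"),
    Token("boek", "N", 2, "obj1"),
]

features = context_features(tokens, 3, 5)  # source phrase "een boek"
print(features)
```

For the span covering "een boek", this sketch selects "boek" as the head, with incoming relation `obj1` and the dependent-relation set `{det}`. In a context-informed PB-SMT setup, features of this kind would be attached to each source phrase and passed to a classifier that scores candidate target phrases; that step is outside the scope of this sketch.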

Similar resources

Source Phrase Segmentation and Translation for Japanese-English Translation Using Dependency Structure

There are various approaches to statistical machine translation (SMT). In particular, phrase-based SMT (PBSMT) is used as a de facto standard for many language pairs because it works robustly across languages and it is easy to implement. However, the results of PBSMT can include ungrammatical sentences, since it typically does not take syntactic structure into account. To overcome this problem,...

Improving Fluency by Reordering Target Constituents Using MST Parser in English-to-Japanese Phrase-based SMT

We propose a reordering method to improve the fluency of the output of the phrase-based SMT (PBSMT) system. We parse the translation results that follow the source language order into non-projective dependency trees, then reorder dependency trees to obtain fluent target sentences. Our method ensures that the translation results are grammatically correct and achieves major improvements over PBSM...

Feature Engineering in Persian Dependency Parser

A dependency parser is one of the most important fundamental tools in natural language processing; it extracts the structure of sentences and determines the relations between words based on dependency grammar. Dependency parsing is well suited to free-word-order languages, such as Persian. In this paper, a data-driven dependency parser has been developed with the help of a phrase-structure parser fo...

Soft Dependency Matching for Hierarchical Phrase-based Machine Translation

This paper proposes a soft dependency matching model for hierarchical phrase-based (HPB) machine translation. When an HPB rule is extracted, we enrich it with dependency knowledge automatically learnt from the training data. The dependency knowledge not only encodes the dependency relations between the components inside the rule, but also contains the dependency relations between the rule and it...

Supertags as Source Language Context in Hierarchical Phrase-Based SMT

Statistical machine translation (SMT) models have recently begun to include source context modeling, under the assumption that the proper lexical choice of the translation for an ambiguous word can be determined from the context in which it appears. Various types of lexical and syntactic features have been explored as effective source context to improve phrase selection in SMT. In the present w...

Publication date: 2009